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Abstract. Using fiducial markers on patient's body surface to predict the tumor 
location is a widely used approach in lung cancer radiotherapy. The purpose of this 
work is to propose an algorithm that automatically identifies a sparse set of locations 
on the patient's surface with the optimal prediction power for the tumor motion. In 
our algorithm, it is assumed that there is a linear relationship between the surface 
marker motion and the tumor motion. The sparse selection of markers on the external 
surface and the linear relationship between the marker motion and the internal tumor 
motion are represented by a prediction matrix. Such a matrix is determined by solving 
an optimization problem, where the objective function contains a sparsity term that 
penalizes the number of markers chosen on the patient's surface. Bregman iteration is 
used to solve the proposed optimization problem. The performance of our algorithm 
has been tested on realistic clinical data of four lung cancer patients. Thoracic 4DCT 
scans with 10 phases are used for the study. On a reference phase, a grid of points 
are casted on the patient's surface (except for patient's back) and propagated to other 
phases via deformable image registration of the corresponding CT images. Tumor 
locations at each phase are also manually delineated. We use 9 out of 10 phases of the 
4DCT images to identify a small group of surface markers that are most correlated with 
the motion of the tumor, and find the prediction matrix at the same time. The 10th 
phase is then used to test the accuracy of the prediction. It is found that on average 
6 to 7 surface markers are necessary to predict tumor locations with a 3D error of 
about 1mm. It is also found that the selected marker locations lie closely in those 
areas where surface point motion has a large amplitude and a high correlation with 
the tumor motion. Our method can automatically select sparse locations on patient's 
external surface and estimate a correlation matrix based on 4DCT, so that the selected 
surface locations can be used to place fiducial markers to optimally predict internal 
tumor motions. 
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1. Introduction 

Modern radiotherapy techniques, such as Intensity Modulated Radiation Therapy 
(IMRT), are capable of delivering highly conformal radiation dose to a cancerous target 
while sparing critical structures and normal tissues. Intra-fraction tumor motion caused 
by patient respiration, however, may lead to geometric miss of the target and hence 
potentially compromise the efficacy of these techniques while treating tumors at lung 
or upper abdomen area. To mitigate this problem, a number of techniques have been 
developed, such as gated treatment, for which accurate modeling and prompt prediction 
of tumor motion are necessary (Jiang, 2006 b; Jiang, 2006 a). 

Tumor localization methods can be generally categorized according to the locations 
of surrogates. Methods using internal surrogates, such as gold markers implanted in 
or near tumor, are accurate but have issues like the risks of pneumothorax for lung 
cancer patients (Arslan et al., 2002; Geraghty et al., 2003), marker migration (Nelson 
et al., 2007), and the extra imaging radiation dose (Jiang, 2006 b). In contrast, external 
surrogate based tumor localization is usually noninvasive and radiation free. In such 
methods, a regression model is first built between the coordinates of some empirically 
selected external surrogates and those of the tumor using a training data set. Such a 
model will be utilized to predict the tumor location using the real-time measurements of 
the marker locations during a treatment via, for example, Cyberknife Synchrony system 
(Accuray Corporate, Sunnyvale, CA, USA) (Pepin et al., 2011). Yet, the accuracy of this 
method usually relies on the correlation between external marker motion and internal 
tumor motion for a particular patient (Hoisak et al., 2004). 

In fact, there are a few questions one should keep in mind while using external 
markers for tumor tracking. First of all, how many external markers are necessary? 
While using more markers may potentially provide more comprehensive information for 
tumor location estimation, it is evident that the motion of points on a patient surface is 
strongly correlated and information from many surface markers is likely to be redundant. 
Clinically, it is necessary and desirable to use a minimum number of markers to predict 
the tumor motion to a satisfactory degree. Second, given the number of markers, where 
shall we optimally place them? Despite a lot of studies regarding the patient breathing 
pattern and the selection of marker locations(Yan et al., 2006; Wu et al., 2008), markers 
are placed empirically in most clinical practice. 

In this study, we will attempt to answer the aforementioned two questions utilizing 
a sparse optimization approach. Specifically, our objective is to choose a sparse set of 
points from all the points on the front surface of a patient, so that a linear motion model 
yields the smallest error in tumor location prediction. With a novel optimization model 
to formulate this objective in a clean and precise mathematical language, as well as 
an effective numerical algorithm to solve the problem, we can effectively yet efficiently 
identify the key surface points used to predict tumor motion. A linear regression model is 
also developed during the optimization process, such that those markers collaboratively 
predict tumor locations to a satisfactory extent. 
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2. Methods and Materials 

We start with an introduction of some notations. Denote Y G M 3xm as a 3 x m 
matrix whose column vectors are the three Cartesian coordinates of the center of 
the tumor at various times tj with j = 1,2, ... ,m. Suppose there are k candidate 
surface points available for tumor motion prediction. We denote the coordinates of 
the collection of all of those surface points at a given time tj as a column vector 
Xj = [xi(tj), X2(tj), . . . , Xk(tj)] T , where each vector X{ = [^1,^2,^3] for i = l,...,k 
contains three entries corresponding to the three Cartesian coordinates of the point %. 
If we assemble all the collections of markers Xj associated with different time tj, we will 
have the following matrix X := [X u X 2 , ■ ■ ■ , X m ] E R 3kxm . 

2.1. Optimization Model 

Assume, for simplicity, there is a linear motion model that relates the external marker 
motion and the tumor motion. Mathematically speaking, there exist a matrix A e ]R 3x3A: 
such that AX ~ Y. Note that the columns of the matrix A can be also associated to 
those k surface points, each with three coordinates. If one column of the matrix A is 
non-zero, the corresponding coordinate for that surface point is then utilized to predict 
the tumor motion. As it is our purpose to select only a few surface points for tumor 
motion prediction, the problem can be casted as finding a matrix A with only a few 
non-vanishing columns, such that the motion of tumor recorded in Y can be accurately 
characterized by AX. Although this is simply a linear motion prediction model, our 
numerical experiments indicate that such an assumption is reasonable and leads to 
accurate tumor location estimations. We shall refer to the problem of optimal marker 
selection as the problem of finding the linear dependence of the motion of the internal 
tumor with the motion of some sparsely selected markers. 

We propose our optimal marker selection model as follows: 

mm{P|| 2il : AX = Y} , (1) 
1 

where ||A|| 2 ,i := X^(Sj a ?,j) 2 an d A = (a i; j). In this optimization problem, the 
objective function is defined in such a way that it groups all the matrix elements 
in a column of A utilizing an ^-norm and then takes £i-norm among all columns. 
Minimizing such an objective function term enables us to enforce sparsity at only the 
level of matrix columns. This idea is inspired by that of compressed sensing (Candes 
et al., 2006; Candes and Tao, 2006; Candes and Tao, 2005; Donoho, 2006), which is a 
recent revolutionary concept in information theory. The applications of such a £2,1 norm 
has been recently explored in many problems, such as beam orientation optimization 
for IMRT(Jia et al., 2011), to effectively select only a few groups of elements. Similar 
idea was also used in (Esser et al., 2011) where the £i tOQ norm was used for matrix 
factorization with applications in hyperspectral image unmixing. We remark that the 
model (1) not only sparsely selects markers needed to track the motion of an internal 
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so tumor, but also provides the linear dependence of the motion of the selected markers 
with that of the tumor at the same time. All such information is integrated within the 
solution matrix A. 



2.2. Fast Numerical Algorithm 

To solve the proposed optimization problem (1), we use a Bregman distance-based 
85 algorithm proposed by Yin et. al. (Yin et al., 2008), which is proven to be efficient for 
t\ minimization problems. Given matrices X and Y, the fast algorithm that solves (1) 
can be written into an iterative form as: 

A k+i = argm i n [ fI \\A\\ 2A + l\\AX -Y k \\ 2 F } , 

yfe+i =Y k + Y - A k+1 X, 

where k is the iteration index and || • \\p is the Frobenius norm. The optimization 
90 problem in the first subproblem of (2) can be solved using the proximal forward- 
backward splitting algorithm (Combettes and Wajs, 2006; Hale et al., 2007), which 
by itself is an iterative algorithm as: 

A p+1 = %{A P - 5(A P X - Y k )X T ), (3) 

where p is the iteration index in this subproblem and T^{B), for a given matrix 
B = [£?!, B 2 , . . . , B m ], is defined as 

%{B) := 

We note that (Donoho, 1995; Wang et al., 2007) T^{B) is the closed form solution 
95 to min {//||X|| 2 ,i + \\\X — -B|||i} . For computation efficiency, we shall not solve the 

subproblem (2) accurately by using numerous iterations of (3), but only use one iteration 
instead. Now, applying (3) (with only one iteration) to (2), we have the following fast 
algorithm that solves (1) (also known as the Bregmanized operator splitting algorithm 
(Zhang et al, 2010)): 

Algorithm 1 Optimal Marker Selection Algorithm 
Step 0. Initialization: k = 0, A = and Y° = 0. 

while stopping criteria is not satisfied do 
Step 1. 

A k+1 = %(A k - 5(A P X - Y k )X T ) 

Step 2. 

yfe+i _ Y k _|_ y — A k+1 X 

end while 



B B 
max(|5i| - At,0)— ^, • • • ,max(|S m | - /x,0)- 



ioo The proof of the mathematical properties of this algorithm, such as convergence, 
is beyond the scope of this paper. Interested readers can consult references for more 
details(Yin et al., 2008; Zhang et al., 2010). 
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For realistic patient data, because of the presence of noise and the fact that the 
motion of internal tumor is only approximately linearly dependent on the external 
markers, we should not expect the relative residual ||A fc X — Y ||f/||^||f decrease to 0. In 
fact, numerically we observe that the relative residual should have a lower bound whose 
value depends on X and Y and it is very difficult to estimate beforehand. Therefore, 
we adopt the following stopping criteria: 

\\A k X-Y\\ F \\A k - l -A k \\ F 

n < 61 01 \w\ F <62 ' 

In other words, we fix an ei as a satisfactory amount for the residual; meanwhile, if such 
residual is not attainable, we will terminate the algorithm when A k is not changing too 
105 much according to the tolerance €2- 



2.3. Patient Data 

To validate our algorithm with realistic clinical cases, 4DCT scan data of four lung 
cancer patients is used. For those patients, a four-slice GE LightSpeed CT scanner 
(GE Medical Systems, Milwaukee, WI, USA) was used to acquire the 4DCT data for 

no treatment simulation. Each axial CT slice has a thickness of 2.5mm and the 4DCT 
was obtained using respiratory signals from the Varian RPM system (Varian Medical 
Systems, Inc., Palo Alto, CA, USA). The 4DCT scan consists of ten different phases 
of one breathing cycle; and the CT volume at each respiratory phase consists of 100 to 
144 slices of CT images covering the most of thorax area depending on patients. Each 

115 slice of CT image has 512 x 512 pixels, with a pixel size of 0.977 x 0.977mm 2 . For each 
patient, tumor GTV was manually contoured on 4DCT scan images of ten respiratory 
phases by an expert observer and the 3D tumor center coordinates were identified. 
Table 1 summarizes the number of CT image slices for each CT image volume and 
the average tumor motion amplitude in the superior-inferior(S-I) direction and average 

120 surface motion amplitude for each patient. It can be observed that the average tumor 
motion amplitude in the S-I direction range from 3.3mm to 9.0mm. The average surface 
point motion amplitude ranges among all the patients are found to be 0.8mm to 2.0mm. 



Patient 


No. of slices 


Tumor motion ampli- 


Average surface mo- 






tude in S-I (mm) 


tion amplitude (mm) 


1 


144 


6.3 


2.0 


2 


100 


9.0 


1.5 


3 


132 


7.4 


1.8 


4 


104 


3.3 


0.8 



Table 1. Summary of patient data with number of CT slices, average tumor motion 
amplitude in S-I direction, and average surface motion amplitude for each patient. 

Meanwhile, the external surfaces of each patient, excluding the patient's back, at 
each phase are extracted by segmenting the CT images using a simple threshold method. 
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125 For each patient, the CT image volume at the end of inhale is set as the target image; 
the other nine CT image volumes, corresponding to the other nine different respiratory 
phases, are set as moving images. The correspondence between surfaces at different 
phases is established by deformable image registration (Thirion, 1998; Gu et al., 2010). 
When surface points are available on the external surfaces of each patient, we further 

no sub-sampled the point sets uniformly to reduce the total number of candidate points 
for a better computational efficiency In our experiments, we choose approximately 200 
candidate surface points for each patient. 

2.4- Validation 

To validate our marker selection algorithm, we employ an leave-one-out cross validation 

135 (LOOCV) method. Specifically, 10 tests are performed for a patient, and for each test, 
we single out one of the 10 respiratory phases and use the other 9 to form the matrix Y 
and solve for the matrix A using Algorithm 1. We then validate our method by using 
the matrix A to predict the location of the tumor at the phase that has been singled 
out. The deviation of the predicted tumor location from the actual tumor location is 

wo characterized by the 3D Euclidean distance between them in mm. 

The patients' 4DCT image volumes cover a complete breathing cycle, hence contain 
information of external surface motion. We could in principle identify regions of interest 
(ROIs) on the patient surface that strongly correlate with tumor motions. It is expected 
that the marker locations selected by Algorithm 1 should fall closely into those ROIs. 

145 This also serves as a criterion for the justification of the correctness of our marker 
selection algorithm. To select the ROI, we consider the following two metrics. First, 
from the deformation vector fields between different respiratory phases, the motion 
trajectory for all surface points were extracted. The correlation function between the 
internal tumor motion in the S-I direction and the motion vector of each point on the 

150 external surface was employed as a metric. However, only part of the external surface 
has considerable motion amplitude and those points with small motion amplitude should 
not be considered for predicting tumor motions despite their possible high correlations 
with tumor S-I motion. Therefore, we only focus on the surface region with large 
motion amplitudes. Combining the two criteria, we define the ROI as the areas on the 

155 surface in which the motion amplitude is larger than 80% of the maximum value and 
the correlation is above 0.85. Although those threshold values for the two criteria are 
chosen empirically, the general conclusions presented in the rest of this paper are found 
not sensitive to them. 

3. Results 

wo 3.1. Marker selection 

We have studied the validation of our surface marker selection algorithm on 4 lung 
cancer patients. The selected 6 surface markers in one typical patient (patient No. 4) 
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Figure 1. Left: Markers selected by our algorithm are shown as red circles on one of 
the patient's surface. Right: the LOOCV results for the same patient using the phases 
1 through 9 as training data (blue dots) and the phase 10 as the testing data (red dot). 
The red circle is the predicted tumor location. 

are drawn in 3D space on the patient surface, as shown in the left panel of Fig. 1. 
Meanwhile, in the right panel of Fig. 1, we demonstrate the LOOCV results for the 

165 same patient using the phases 1 through 9 as training data and the phase 10 as the 
testing data. Specifically, the blue dots are the locations of the tumor in the training 
phases and the red dot is the location of the tumor at the phase 10. The red circle is the 
predicted location using the selected surface markers and the matrix A. The 3D distance 
between the true tumor location and the predicted location is 0.83mm, indicating the 

170 great capability for tumor motion prediction of our algorithm. 



Patient 


Error 


^mm) 


^Markers 


Time 


(sec) 


mean 


std 


mean 


std 


mean 


std 


1 


1.85 


1.15 


5.5 


0.85 


10.6 


4.5 


2 


1.22 


1.06 


5.5 


1.58 


4.6 


1.9 


3 


0.44 


0.28 


5.4 


1.84 


10.8 


3.0 


4 


0.83 


0.29 


7.5 


1.35 


30.5 


11.6 


Average 


1.08 


0.69 


5.98 


1.04 


14.1 


5.2 



Table 2. Summary of tumor location prediction errors, the numbers of markers 
selected, and the computation time. 

A summary of the results of all 10 tests for each of the 4 patients is given in Table 2. 
For each patient, we compute the mean and the standard deviation of the 3D errors for 
the predicted tumor locations and the number of selected markers over all the 10 tests 
in the LOOCV. It is found that, on average, our algorithm can automatically select 
175 about 6 surface markers that collaboratively predict tumor motion with an 3D error 
about 1mm. 

Algorithm 1 is implemented using MATLAB on a laptop with Intel Core i7 (1.73 
GHz) CPU and 8.0G RAM. As for the computation time, it is found that the average 
time required to perform one optimization is about 14sec. We emphasize that the 
i8o time reported here is the one for marker selection. Once the markers are selected, the 
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matrix A becomes available. The prediction of tumor motion using selected markers 
only requires a simple matrix multiplication and hence the prediction can be achieved 
in a negligible amount of computation time. From Table 2, it is also found that the 
computational time for marker selection varies from case to case, which is possibly 
185 ascribed to the different patient sizes. 

3.2. Comparison with ROI 

The correlation between the internal tumor motion in the S-I direction and the external 
surface motion is shown on Fig. 2. In Fig. 3, we also present the amplitude of external 
surface motion. Combining the correlation map and the motion amplitude map, the 

190 ROIs for each patient can be identified, shown as red regions in Fig. 4, where the ROIs 
have correlation coefficients larger than 0.85 and surface motion amplitude greater than 
80% of the maximum value. Apparently, the ROIs are highly dependent on different 
breathing motion patterns among patients. We also plot in Fig. 4 the locations of 
markers selected with our algorithm. We can see that most of the marker positions 

195 selected by our algorithm fall inside or close to the ROIs, which indicate the robustness 
of our algorithm. 




Figure 2. Color maps showing the correlation coefficients between the external surface 
motion and the internal tumor motion for 4 patients. 
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Figure 4. Color maps showing the regions of interest (where the motion amplitudes 
are relatively large and the correlation coefficients are relatively high) and the locations 
of the selected markers. 
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4. Conclusions 

In this paper, we proposed a novel mathematical model to automatically determine the 
optimal number and locations of fiducial markers on patient's surface for predicting 

200 lung tumor motion. We also introduced an efficient numerical algorithm for solving 
the proposed model. Experiments on the 4DCT data of 4 lung cancer patients have 
shown that, by using our method, usually 6-7 markers are selected on patient's external 
surface. Most of these markers are in the regions where the surface motion is relatively 
large and the correction between the surface motion and the internal tumor motion is 

205 relatively high. Using these markers, the lung tumor positions can be predicted with an 
average 3D error of approximately 1mm. Both the number of markers and the prediction 
accuracy are clinically acceptable, indicating that our method can be used in clinical 
practice. 
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